RTS Model V2 Performance Analysis

TODO:

  • Possible to remove extra influence of multiple RTS within one tile?
  • Figure out why my IoU scores don’t quite match Yili’s - I’ve spent some time on this and am still uncertain.
  • Facet the maps of the polygons instead of looping over them
  • Add legend for the validation and prediction polygons
  • Add statistical tests for boxplots
  • Shapley analysis

Set-Up

Load Libraries

Prep Google Drive Authentication

Define Functions

assign_conf_stars

avg_precision

bbox

eval_expression

get_rast_id

googledrive_download

input_as_df

import_pred

make_my_dir

plot_prediction

plot_zonal_stats

pred_as_poly

recall_precision

trim_outliers

val_as_poly

Prep Plot Variables

Load Data

Polygons

## Reading layer `rts_polygons_for_Yili_May_2022_v2' from data source 
##   `/home/hrodenhizer/Documents/permafrost_pathways/rts_mapping/rts_data_comparison/data/rts_polygons/rts_polygons_for_Yili_May_2022_v2.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 138 features and 11 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -139.218 ymin: 68.99102 xmax: 124.304 ymax: 72.99794
## Geodetic CRS:  WGS 84

Download Files

## [1] "There are 69 prediction tiles."
## [1] "There are 138 input data tiles."

Maxar GeoTiffs

Planet GeoTiffs

Sentinel GeoTiffs

Get Tile Bounding Boxes

Convert Predictions to Vector

Join Prediction Polygons into polys SF Dataframe

Convert Validation to Vector

Join Validation Polygons into polys SF Dataframe

Convert Input Data to DF

Interactive Map of Features

IoU

Calculate Intersection and Union

Calculate IoU

## # A tibble: 3 × 2
##   imagery    mean_iou
##   <chr>         <dbl>
## 1 Maxar          0.66
## 2 Planet         0.64
## 3 Sentinel-2     0.64
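Per-feature IoU boils down to intersection area over union area of each prediction/validation polygon pair. A minimal sketch with sf (the helper name `iou` and the toy squares are illustrations, not the pipeline's actual code):

```r
library(sf)

# IoU of one prediction/validation polygon pair
iou <- function(pred, val) {
  intersection <- st_intersection(pred, val)
  union <- st_union(pred, val)
  as.numeric(st_area(intersection)) / as.numeric(st_area(union))
}

# Toy example: two unit squares offset by 0.5 in x
sq <- function(x0) st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0),
                                         c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
pred <- st_sfc(sq(0))
val  <- st_sfc(sq(0.5))
iou(pred, val)  # intersection 0.5, union 1.5 -> IoU = 1/3
```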

Mean Average Precision

Recall-Precision Curves

MAP

Plot

Precision is sensitive to false positives (of all predicted RTS, how many match a validation feature?); recall is sensitive to false negatives (of all validation RTS, how many were predicted?).
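As a concrete illustration (hypothetical counts, not values from this analysis), precision and recall at a given matching threshold reduce to:

```r
# Precision: of all predicted RTS, the fraction that match a validation feature
# Recall:    of all validation RTS, the fraction matched by a prediction
precision <- function(tp, fp) tp / (tp + fp)
recall    <- function(tp, fn) tp / (tp + fn)

# Hypothetical counts: 40 true positives, 10 false positives, 20 false negatives
precision(40, 10)  # 0.8
recall(40, 20)     # ~0.667
```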

This figure will only be interesting once we have added in negative training data.

Performance by Feature Size

Calculate Area

Size Distribution by Region

## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries

## # A tibble: 2 × 5
##   yg          mean_size min_size max_size median_size
##   <chr>           <dbl>    <dbl>    <dbl>       <dbl>
## 1 Other          20702.      484   107280       10484
## 2 Yamal/Gydan    11290.      512    47548        5732
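A summary like the one above can be produced with sf and dplyr along these lines (a sketch: the toy `polys` object stands in for the real joined sf dataframe, and the `yg` grouping column is taken from the table above):

```r
library(sf)
library(dplyr)

# Toy stand-in for the joined `polys` sf dataframe (real data has more columns)
sq <- function(x0, s) st_polygon(list(rbind(c(x0, 0), c(x0 + s, 0),
                                            c(x0 + s, s), c(x0, s), c(x0, 0))))
polys <- st_sf(yg = c("Other", "Other", "Yamal/Gydan"),
               geometry = st_sfc(sq(0, 1), sq(2, 2), sq(5, 3)))

# st_area() returns m^2 when the CRS is projected in metres
size_summary <- polys |>
  mutate(area_m2 = as.numeric(st_area(geometry))) |>
  st_drop_geometry() |>
  group_by(yg) |>
  summarise(mean_size   = mean(area_m2),
            min_size    = min(area_m2),
            max_size    = max(area_m2),
            median_size = median(area_m2))
```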

Plot

Raw IoU Scores:

This is complicated by the fact that areas calculated from the raster validation layer may combine several RTS features within one tile. Use rts_area (from the original RTS delineations) instead.

Run nls models and bootstrap parameters

Bootstrap predictions for plotting the nls models

Plot the Size/Performance plot

Active vs. General RTS Performance

## # A tibble: 3 × 7
##   imagery    p_val x_pos star_y_pos label_y_pos p_label         star_label
##   <fct>      <dbl> <dbl>      <dbl>       <dbl> <chr>           <chr>     
## 1 Maxar      0.962   1.5        0.9        0.95 p-value = 0.962 ""        
## 2 Planet     0.458   1.5        0.9        0.95 p-value = 0.458 ""        
## 3 Sentinel-2 0.528   1.5        0.9        0.95 p-value = 0.528 ""
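The p-values above come from comparing IoU between the two RTS classes within each imagery type. A sketch using a two-sample t-test (the `iou_df` input and its column names `rts_class` and `iou` are hypothetical stand-ins for the actual objects):

```r
library(dplyr)

# Hypothetical per-feature IoU scores, classed as active vs general
set.seed(1)
iou_df <- data.frame(
  imagery   = rep(c("Maxar", "Planet", "Sentinel-2"), each = 20),
  rts_class = rep(c("active", "general"), 30),
  iou       = runif(60, 0.4, 0.9)
)

# Two-sample t-test of IoU between classes, per imagery type
p_vals <- iou_df |>
  group_by(imagery) |>
  summarise(p_val = t.test(iou[rts_class == "active"],
                           iou[rts_class == "general"])$p.value)
```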

Plot

It is possible to get rid of the inner panel borders, if I decide that looks better: https://stackoverflow.com/questions/46220242/ggplot2-outside-panel-border-when-using-facet

Drivers of Unexpected RTS Prediction Performance

Classify Features Using Confidence Interval Approach

This approach first uses the 95% CI of the model parameters to determine whether each RTS feature was predicted better or worse than expected under the model. Next, the threshold above which RTS size no longer affects IoU is determined for each imagery type from where the slope of the model approaches 0 (currently slope < 1e-06). RTS features smaller than this threshold that were nonetheless predicted better than expected are analyzed later to determine why some small RTS can be identified from the imagery.
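The slope-threshold step can be sketched as follows. With a saturating fit of the form iou = a * (1 - exp(-b * area)) (an illustrative model form with made-up parameter values, not the fitted ones), the slope declines monotonically, so the size threshold is simply the first area at which it drops below 1e-06:

```r
# Illustrative saturating model: iou = a * (1 - exp(-b * area))
a <- 0.7
b <- 5e-4

# Derivative of the model with respect to area
model_slope <- function(area) a * b * exp(-b * area)

# Smallest area (m^2) at which the slope falls below 1e-06
areas <- seq(0, 1e5, by = 1)
threshold <- areas[which(model_slope(areas) < 1e-6)[1]]
```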

Zonal Statistics

Plot

These plots summarize the input data values (mean or standard deviation) in RTS cells, background cells, and the normalized difference between the two (Delta = (RTS - Background)/Background). Most of the input layer names should be self-explanatory, but for the others:

  • lum = luminance = 0.299*r + 0.587*g + 0.114*b
  • sr = shaded relief
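The two derived quantities above can be written directly (a minimal sketch; the band and cell values are arbitrary illustrations):

```r
# Luminance from RGB bands, using the weights defined above
luminance <- function(r, g, b) 0.299 * r + 0.587 * g + 0.114 * b

# Normalized difference between RTS and background cell values
delta <- function(rts, background) (rts - background) / background

luminance(1, 1, 1)  # weights sum to 1
delta(0.6, 0.5)     # RTS 20% brighter than background -> ~0.2
```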

A few takeaway points:

  • The NIR and NDVI (mean) figures indicate that RTS are predicted poorly where background cells have low NDVI/plant growth (plants reflect NIR, so vegetated cells have higher NIR values). That is, it is harder for the model to find RTS where there is little plant growth across the landscape. Also interesting on this point: the standard deviation of the red and green bands is higher in poorly predicted RTS (Maxar data only), perhaps indicating that patchier vegetation makes it harder for the model to find RTS.
  • The NDWI (mean) figure indicates that RTS features are predicted better where it is drier (fewer lakes?).
  • Elevation (mean): I still need to think about this, because what it means probably depends on exactly how the elevation was normalized. The standard deviation of elevation figure indicates that RTS features are identified more easily where elevation is more variable within the RTS than in the background cells. That is, a smooth landscape with an uneven RTS is easy for the model to find.